4 research outputs found

    An in-depth evaluation of multimodal video genre categorization

    In this paper we propose an in-depth evaluation of the performance of video descriptors for multimodal video genre categorization. We discuss the prospect of designing appropriate late fusion techniques that would make it possible to attain very high categorization accuracy, close to that achieved with user-generated text information. Evaluation is carried out in the context of the 2012 Video Genre Tagging Task of the MediaEval Benchmarking Initiative for Multimedia Evaluation, using a data set of up to 15,000 videos (3,200 hours of footage) and 26 video genre categories specific to web media. Results show that the proposed approach significantly improves genre categorization performance, outperforming other existing approaches. The main contribution of this paper lies in the experimental part: several valuable findings are reported that motivate further research on video genre classification.
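
    As a rough illustration of the score-level late fusion discussed here, the sketch below trains one probabilistic classifier per modality and combines their class-probability outputs with a weighted average. The random feature arrays, descriptor dimensionalities and fusion weights are placeholders, not the descriptors or weights used in the paper.

    # Minimal late-fusion sketch: train one probabilistic classifier per modality
    # and combine their class-probability outputs with a weighted average.
    # Feature arrays, dimensionalities and fusion weights are placeholders,
    # not the descriptors or weights used in the paper.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def train_modality_classifier(X, y):
        """Fit one probabilistic classifier on a single modality's descriptors."""
        return LogisticRegression(max_iter=1000).fit(X, y)

    def late_fusion_predict(classifiers, features, weights):
        """Weighted average of per-modality class-probability estimates."""
        fused = sum(w * clf.predict_proba(X)
                    for clf, X, w in zip(classifiers, features, weights))
        return (fused / sum(weights)).argmax(axis=1)

    # Hypothetical precomputed descriptors for audio, visual and text modalities.
    rng = np.random.default_rng(0)
    y_train = rng.integers(0, 26, size=200)                    # 26 genre labels
    train = [rng.normal(size=(200, d)) for d in (64, 128, 300)]
    test = [rng.normal(size=(50, d)) for d in (64, 128, 300)]

    models = [train_modality_classifier(X, y_train) for X in train]
    genres = late_fusion_predict(models, test, weights=[0.2, 0.3, 0.5])

    In practice the fusion weights would be tuned on a validation set rather than fixed by hand.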

    An audio-visual approach to web video categorization

    In this paper we address the issue of automatic video genre categorization of web media using an audio-visual approach. To this end, we propose content descriptors that exploit audio, temporal structure and color information. The potential of our descriptors is validated experimentally both from the perspective of a classification system and as an information retrieval approach. Validation is carried out on a real-world scenario, namely more than 288 hours of video footage and 26 video genres specific to the blip.tv media platform. Additionally, to reduce the semantic gap, we propose a new relevance feedback technique based on hierarchical clustering. Experimental tests show that retrieval performance can be significantly increased in this case, becoming comparable to that obtained with high-level semantic textual descriptors.
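
    A generic sketch of relevance feedback driven by hierarchical clustering follows; it is an assumption-laden illustration, not the exact algorithm proposed in the paper. Descriptors the user marks as relevant are grouped with agglomerative clustering, and the database is re-ranked by distance to the nearest relevant-cluster centroid.

    # Generic relevance-feedback sketch: cluster the user-marked relevant items
    # hierarchically, then re-rank the database by distance to the nearest
    # cluster centroid. Not the exact procedure used in the paper.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import cdist

    def rerank_with_feedback(database, relevant, n_clusters=3):
        """Re-rank database descriptors using hierarchically clustered relevant examples."""
        Z = linkage(relevant, method="ward")                 # agglomerative clustering
        labels = fcluster(Z, t=n_clusters, criterion="maxclust")
        centroids = np.array([relevant[labels == c].mean(axis=0)
                              for c in np.unique(labels)])
        dists = cdist(database, centroids).min(axis=1)       # distance to nearest cluster
        return np.argsort(dists)                             # best matches first

    # Hypothetical audio-visual descriptors (one row per video).
    rng = np.random.default_rng(1)
    database = rng.normal(size=(1000, 64))
    relevant = database[:10] + rng.normal(scale=0.1, size=(10, 64))
    ranking = rerank_with_feedback(database, relevant)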

    A Visual-based Late-Fusion Framework for Video Genre Classification

    In this paper we investigate the performance of visual features in the context of video genre classification. We propose a late-fusion framework that employs color, texture, structural and salient-region information. Experimental validation was carried out in the context of the MediaEval 2012 Genre Tagging Task, using a large data set of more than 2,000 hours of footage and 26 video genres. Results show that the proposed approach significantly improves genre classification performance, outperforming other existing approaches. Furthermore, we show that our approach can help improve the performance of the more efficient text-based approaches.

    A Fisher Kernel Approach for Multiple Instance Based Object Retrieval in Video Surveillance

    This paper presents an automated surveillance system that exploits the Fisher Kernel representation in the context of a multiple-instance object retrieval task. The main purpose of the proposed algorithm is to track a list of persons across several video sources using only a few training examples. In the first step, the Fisher Kernel representation describes a set of features as the gradient, with respect to the model parameters, of the log-likelihood of the generative probability distribution that models the feature distribution. Then, we learn this generative probability distribution over all features extracted from a reduced set of relevant frames. The proposed approach shows significant improvements, and we demonstrate that Fisher kernels are well suited for this task. We demonstrate the generality of our approach in terms of features by conducting an extensive evaluation with a broad range of keypoint features. We also evaluate our method on two standard video surveillance datasets, attaining superior results compared to state-of-the-art object recognition algorithms.
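
    The gradient construction described in the abstract can be sketched with the standard Fisher-vector mean component over a diagonal-covariance GMM. The number of Gaussians, descriptor dimensionality and random placeholder features below are illustrative assumptions, standing in for real keypoint features, rather than the paper's actual setup.

    # Sketch of the Fisher-vector mean component: the gradient of the GMM
    # log-likelihood with respect to the Gaussian means, accumulated over the
    # local descriptors of one object/frame set. Descriptors are random
    # placeholders standing in for real keypoint features (e.g. SIFT-like).
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def fisher_vector_means(gmm, X):
        """Mean-gradient component of the Fisher vector for descriptor set X."""
        T = X.shape[0]
        gamma = gmm.predict_proba(X)                      # responsibilities, (T, K)
        sigma = np.sqrt(gmm.covariances_)                 # diagonal std devs, (K, D)
        grads = []
        for k in range(gmm.n_components):
            diff = (X - gmm.means_[k]) / sigma[k]         # whitened residuals
            g = gamma[:, k] @ diff / (T * np.sqrt(gmm.weights_[k]))
            grads.append(g)
        return np.concatenate(grads)                      # length K * D

    # Hypothetical local descriptors: a training pool for the GMM and one query set.
    rng = np.random.default_rng(2)
    train_desc = rng.normal(size=(5000, 64))
    gmm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0)
    gmm.fit(train_desc)
    fv = fisher_vector_means(gmm, rng.normal(size=(300, 64)))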